Promise and Challenge


A Preliminary Study on the Promises and Challenges of Native Top-$k$ Sparse Attention

Xiu, Di, Tang, Hongyin, Rong, Bolin, Yan, Lizhi, Wang, Jingang, Lu, Yifan, Cai, Xunliang

arXiv.org Artificial Intelligence

Large Language Models (LLMs) are increasingly prevalent in the field of long-context modeling; however, their inference costs have become a critical bottleneck hindering progress on tasks such as agentic and multimodal applications. This report conducts a preliminary investigation into the effectiveness and theoretical mechanisms of the Top-$k$ Attention mechanism during both the decoding and training phases. First, we validate the effectiveness of exact Top-$k$ Decoding through extensive experimentation. Experiments demonstrate that retaining only the pivotal Keys with the highest similarity to the Query as the context window during the decoding stage achieves performance comparable to, or even surpassing, full attention on downstream tasks such as HELMET and LongBench v2. Second, we explore the native Top-$k$ Attention training strategy. Experiments confirm that ensuring consistency between training and inference in Top-$k$ Attention operations further unlocks the potential of Top-$k$ Decoding, significantly enhancing model performance. Furthermore, considering the high computational complexity of exact Top-$k$ Attention, we investigate the impact of approximate Top-$k$ algorithm precision on downstream tasks. Our research confirms a positive correlation between downstream task performance and approximation fidelity, and we provide statistical evaluations of the Lightning Indexer's precision within the DeepSeek-V3.2-Exp model. Finally, this report provides a theoretical interpretation from the perspective of entropy. Experimental observations indicate that models subjected to Top-$k$ Attention SFT exhibit a distinct entropy-reduction phenomenon on downstream tasks, which supports the hypothesis that low-entropy states are better adapted to Top-$k$ Decoding.
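The decoding scheme the abstract describes — score every cached Key against the current Query, keep only the $k$ highest-scoring Keys, and run softmax attention over that subset — can be sketched in a few lines of NumPy. This is an illustrative single-query sketch under our own naming and shape conventions, not the paper's implementation:

```python
import numpy as np

def topk_attention(q, K, V, k):
    """Exact Top-k decoding sketch for a single query.
    q: (d,) query; K: (n, d) cached keys; V: (n, d) cached values.
    Scores all keys, retains the k highest-scoring ones, and
    computes softmax attention over that subset only."""
    d = q.shape[-1]
    scores = K @ q / np.sqrt(d)                    # similarity of the query to every key
    if k < len(scores):
        keep = np.argpartition(scores, -k)[-k:]    # indices of the top-k keys
    else:
        keep = np.arange(len(scores))              # k >= n: identical to full attention
    s = scores[keep]
    w = np.exp(s - s.max())
    w /= w.sum()                                   # softmax over the retained keys only
    return w @ V[keep]
```

With $k \ge n$ this reduces exactly to full attention, which is why comparing the two isolates the effect of discarding low-similarity Keys.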


The Promise and Challenges of Using LLMs to Accelerate the Screening Process of Systematic Reviews

Huotala, Aleksi, Kuutila, Miikka, Ralph, Paul, Mäntylä, Mika

arXiv.org Artificial Intelligence

Systematic review (SR) is a popular research method in software engineering (SE). However, conducting an SR takes an average of 67 weeks, so automating any step of the SR process could reduce the effort associated with SRs. Our objective is to investigate whether Large Language Models (LLMs) can accelerate title-abstract screening by simplifying abstracts for human screeners, and by automating title-abstract screening. We performed an experiment in which humans screened titles and abstracts for 20 papers, with both original and simplified abstracts, from a prior SR. The experiment was then reproduced with the GPT-3.5 and GPT-4 LLMs performing the same screening tasks. We also studied whether different prompting techniques (Zero-shot (ZS), One-shot (OS), Few-shot (FS), and Few-shot with Chain-of-Thought (FS-CoT)) improve the screening performance of LLMs. Lastly, we studied whether redesigning the prompt used in the LLM reproduction of screening leads to improved performance. Text simplification did not increase the screeners' screening performance, but it reduced the time used in screening. Screeners' scientific literacy skills and researcher status predict screening performance. Some LLM and prompt combinations perform as well as human screeners in the screening tasks. Our results indicate that the GPT-4 LLM is better than its predecessor, GPT-3.5, and that Few-shot and One-shot prompting outperform Zero-shot prompting. Using LLMs for text simplification in the screening process does not significantly improve human performance. Using LLMs to automate title-abstract screening seems promising, but current LLMs are not significantly more accurate than human screeners. To recommend the use of LLMs in the screening process of SRs, more research is needed. We recommend that future SR studies publish replication packages with screening data to enable more conclusive experimentation with LLM screening.
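The prompting techniques compared in the study differ chiefly in how many worked examples precede the item to be screened: zero for ZS, one for OS, and several for FS. A minimal sketch of how such screening prompts might be assembled follows; the `build_prompt` helper, its fields, and its wording are hypothetical, not the paper's actual prompts:

```python
def build_prompt(title, abstract, criteria, examples=None):
    """Hypothetical title-abstract screening prompt builder.
    examples: list of (title, abstract, "include"/"exclude") shots.
    None -> zero-shot; one item -> one-shot; several -> few-shot."""
    parts = [
        f"Decide whether the paper meets the inclusion criteria: {criteria}",
        'Answer with "include" or "exclude".',
    ]
    # Each labeled example becomes one shot in the prompt.
    for t, a, label in (examples or []):
        parts.append(f"Title: {t}\nAbstract: {a}\nAnswer: {label}")
    # The item to be screened comes last, with the answer left open.
    parts.append(f"Title: {title}\nAbstract: {abstract}\nAnswer:")
    return "\n\n".join(parts)
```

A chain-of-thought variant (FS-CoT) would additionally include a short reasoning trace before each example's answer.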


Explainable AI (XAI) in Biomedical Signal and Image Processing: Promises and Challenges

Yang, Guang, Rao, Arvind, Fernandez-Maloigne, Christine, Calhoun, Vince, Menegaz, Gloria

arXiv.org Artificial Intelligence

Artificial intelligence has become pervasive across disciplines and fields, and biomedical image and signal processing is no exception. The growing and widespread interest in the topic has triggered vast research activity, reflected in an exponentially growing research output. Through the study of massive and diverse biomedical data, machine and deep learning models have revolutionized tasks such as modeling, segmentation, registration, classification, and synthesis, outperforming traditional techniques. However, the difficulty of translating the results into biologically and clinically interpretable information is preventing their full exploitation in the field. Explainable AI (XAI) attempts to fill this translational gap by providing means to make the models interpretable and to provide explanations. Different solutions have been proposed so far and are gaining increasing interest from the community. This paper provides an overview of XAI in biomedical data processing and points to an upcoming Special Issue on Deep Learning in Biomedical Image and Signal Processing of the IEEE Signal Processing Magazine, scheduled to appear in March 2022.


AI's Promise and Challenges for Martech

#artificialintelligence

In this article, we discuss the ways artificial intelligence is changing marketing and why this marks a positive change. We also examine how metadata, when collected and analyzed in aggregate, can be more revealing than the event data itself, and why making all this data functional is the main strength of AI technology.


The Promise and Challenges of AI and Machine Learning for Cybersecurity

#artificialintelligence

While cybersecurity has long been an essential concern for IT companies and businesses that depend on technology, expertise with the latest technologies such as AI and Machine Learning can give them a competitive lead in information security and data safety. AI and Machine Learning are currently in the limelight across many industries and use cases, and cybersecurity remains one of their most important beneficiaries. In this post, we explain the role of AI and Machine Learning in cybersecurity. Despite its capacity to mimic aspects of human intelligence, AI still falls far short of replacing human judgment in understanding problems and finding solutions.


Materials Day talks examine the promises and challenges of AI and machine learning

#artificialintelligence

The promises and challenges of artificial intelligence and machine learning were the focus of the Oct. 9 MIT Materials Day Symposium, with presentations …


Addressing the promises and challenges of AI

#artificialintelligence

A three-day celebration event this week for the MIT Stephen A. Schwarzman College of Computing focused on the Institute's new role in helping society navigate a promising yet challenging future for artificial intelligence (AI) as it seeps into nearly all aspects of society. On Thursday, the final day of the event, a series of talks and panel discussions by researchers and industry experts conveyed enthusiasm for AI-enabled advances in many global sectors, but emphasized concerns -- on topics such as data privacy, job automation, and personal and social issues -- that accompany the computing revolution. Kicking off the day's events, MIT President Rafael Reif said the MIT Schwarzman College of Computing will train students in an interdisciplinary approach to AI. It will also train them to take a step back and weigh the potential downsides of AI, which is poised to disrupt "every sector of our society." "Everyone knows pushing the limits of new technologies can be so thrilling that it's hard to think about consequences and how [AI] too might be misused," Reif said.


The promise and challenge of the age of AI

#artificialintelligence

Artificial intelligence promises considerable economic benefits, even as it disrupts the world of work. Three priorities will help achieve good outcomes. This new article is by James Manyika and Jacques Bughin of the McKinsey Global Institute. I hope you find it useful. The time may have finally come for artificial intelligence (AI) after periods of hype followed by several "AI winters" over the past 60 years. AI now powers so many real-world applications, ranging from facial recognition to language translators and assistants like Siri and Alexa, that we barely notice it. Along with these consumer applications, companies across sectors are increasingly harnessing AI's power in their operations. Embracing AI promises considerable benefits for businesses and economies through its contributions to productivity growth and innovation.


Apache Spark: Promises and Challenges - DZone Big Data

#artificialintelligence

If you're looking for a solution for processing huge chunks of data, there are many options these days. Depending on your use case and the type of operations you want to perform on data, you can choose from a variety of data-processing frameworks, such as Apache Samza, Apache Storm…, and Apache Spark. Apache Spark is a full-fledged data engineering toolkit that enables you to operate on large datasets without worrying about the underlying infrastructure. It helps you with data ingestion, querying, processing, and machine learning, while providing an abstraction for building a distributed system. Spark is known for its speed, which results from an improved implementation of MapReduce that keeps data in memory instead of persisting it to disk. Apache Spark provides APIs for three languages: Scala, Java, and Python.